Team 5103 — University of Maryland
2026-02-01
Competition datasets — bloom DOY for 5 target sites:
| Dataset | Location | Records | Span |
|---|---|---|---|
| kyoto.csv | Kyoto, Japan | 837 | 812 – 2025 |
| washingtondc.csv | Washington DC | 106 | 1921 – 2025 |
| liestal.csv | Liestal, Switzerland | 132 | 1895 – 2025 |
| vancouver.csv | Vancouver, Canada | 4 | 2022 – 2025 |
| nyc.csv | New York City | 3 | 2019 – 2025 |
Auxiliary datasets — broaden geographic & temporal coverage:
| Dataset | Records |
|---|---|
| japan.csv (regional bloom dates) | 6,573 |
| meteoswiss.csv (Swiss phenology) | 6,642 |
| south_korea.csv | 994 |
USA-NPN enrichment (NYC):
Total training pool: ~14,598 rows
Model A: Local Trend (per site)
bloom_doy ~ year + year²
Model B: Pooled GAM (all sites jointly)
\[\text{DOY} \sim s(\text{year}) + s(\text{lat}, \text{long}) + s(\text{alt}) + s(\text{site\_obs}) + \text{source}\]
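Model A, the per-site quadratic trend defined above, can be sketched in plain Python as an ordinary least-squares fit. This is a minimal illustration under stated assumptions, not the team's code: the names `fit_quadratic_trend` and `predict_doy` are hypothetical, the year is centred only for numerical stability (the fitted curve is unchanged), and the demo series is synthetic rather than taken from any competition file.

```python
def fit_quadratic_trend(years, doys):
    """Least-squares fit of doy = b0 + b1*t + b2*t^2 with t = year - mean(year)."""
    n = len(years)
    center = sum(years) / n
    t = [y - center for y in years]
    # Design columns [1, t, t^2]; build the normal equations (X^T X) beta = X^T y.
    cols = [[1.0] * n, t, [v * v for v in t]]
    a = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(3)]
         for i in range(3)]
    b = [sum(p * y for p, y in zip(cols[i], doys)) for i in range(3)]
    # Solve the 3x3 system by Gaussian elimination with partial pivoting.
    for k in range(3):
        pivot = max(range(k, 3), key=lambda r: abs(a[r][k]))
        a[k], a[pivot] = a[pivot], a[k]
        b[k], b[pivot] = b[pivot], b[k]
        for r in range(k + 1, 3):
            f = a[r][k] / a[k][k]
            for c in range(k, 3):
                a[r][c] -= f * a[k][c]
            b[r] -= f * b[k]
    beta = [0.0, 0.0, 0.0]
    for k in (2, 1, 0):
        s = sum(a[k][c] * beta[c] for c in range(k + 1, 3))
        beta[k] = (b[k] - s) / a[k][k]
    return center, beta


def predict_doy(model, year):
    """Evaluate the fitted quadratic at a given year."""
    center, (b0, b1, b2) = model
    t = year - center
    return b0 + b1 * t + b2 * t * t


# Synthetic, noise-free series (NOT competition data), so the fit is exact.
years = list(range(1990, 2026))
doys = [100 - 0.1 * (y - 2000) + 0.002 * (y - 2000) ** 2 for y in years]

model = fit_quadratic_trend(years, doys)
print(round(predict_doy(model, 2026), 3))  # approximately 98.752 by construction
```

With only 3 or 4 records, as for New York City and Vancouver, a three-parameter fit like this leaves little or no room for error, which is one reason a pooled model is a natural complement.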
Ensemble blending (data-driven)
\[w_A = \tfrac{1/\text{MAE}_A}{1/\text{MAE}_A + 1/\text{MAE}_B}\]
| Model | Backtest MAE |
|---|---|
| Local (A) | 7.01 days |
| GAM (B) | 7.21 days |
| Ensemble | 6.1 days |
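As a worked check, the inverse-MAE weight formula can be evaluated with the two backtest MAEs from the table. The helper names `inverse_mae_weights` and `blend` are hypothetical, and the two forecasts passed to `blend` are made-up inputs for illustration only.

```python
def inverse_mae_weights(mae_a, mae_b):
    """Weights proportional to 1/MAE, normalised to sum to 1."""
    inv_a, inv_b = 1.0 / mae_a, 1.0 / mae_b
    w_a = inv_a / (inv_a + inv_b)
    return w_a, 1.0 - w_a


def blend(pred_a, pred_b, w_a):
    """Weighted average of the two model forecasts."""
    return w_a * pred_a + (1.0 - w_a) * pred_b


# Backtest MAEs from the table: Local (A) = 7.01 days, GAM (B) = 7.21 days.
w_a, w_b = inverse_mae_weights(7.01, 7.21)
print(f"w_A = {w_a:.1%}, w_B = {w_b:.1%}")  # w_A = 50.7%, w_B = 49.3%

# Hypothetical forecasts of DOY 91 (A) and DOY 89 (B), blended.
print(round(blend(91.0, 89.0, w_a), 2))
```

Because the two MAEs are close, the weights are nearly equal and the blend is close to a simple average of the two models.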
Prediction intervals (split-conformal): for each location, the 90th percentile of the backtest |residuals| is used as the interval half-width.
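The half-width rule can be sketched as follows. This is a minimal illustration under assumptions: the source does not say which percentile convention was used, so the sketch takes the smallest order statistic covering at least 90% of the residuals; `conformal_half_width` and `interval` are hypothetical names; and the residuals are made-up values chosen so that the half-width comes out at 10 days. A textbook split-conformal quantile would use rank ceil(0.9 * (n + 1)) instead, which is slightly more conservative for small n.

```python
def conformal_half_width(abs_residuals, percent=90):
    """Smallest order statistic covering at least `percent`% of the residuals."""
    r = sorted(abs_residuals)
    n = len(r)
    rank = (percent * n + 99) // 100  # integer ceil(percent * n / 100), 1-based
    return r[min(max(rank, 1), n) - 1]


def interval(pred_doy, half_width):
    """Symmetric interval around the point forecast, in day-of-year units."""
    return pred_doy - half_width, pred_doy + half_width


# Made-up absolute backtest residuals in days (NOT the team's values).
residuals = [0.5, 1, 2, 3, 4, 5, 6, 8, 10, 15]

hw = conformal_half_width(residuals)
print(hw)                # 10
print(interval(90, hw))  # (80, 100)
```

Computing the half-width per location lets sites with short or noisy records receive wider intervals than sites with long, stable ones.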
| Location | DOY | Date | Interval | Width (days) |
|---|---|---|---|---|
| kyoto | 90 | Mar 31 | Mar 21 – Apr 10 | 20 |
| liestal | 88 | Mar 29 | Mar 19 – Apr 06 | 18 |
| newyorkcity | 92 | Apr 02 | Mar 26 – Apr 10 | 15 |
| vancouver | 92 | Apr 02 | Mar 18 – Apr 18 | 31 |
| washingtondc | 83 | Mar 24 | Mar 17 – Mar 31 | 14 |
Sum of squared interval widths: 2106
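The widths and their squared sum can be re-derived from the interval endpoints in the table. The sketch below assumes the forecasts are for 2026; the widths themselves do not depend on the year, since every endpoint falls in March or April.

```python
from datetime import date

YEAR = 2026  # assumption: forecast year


def doy(month, day, year=YEAR):
    """Day of year for a calendar date."""
    return date(year, month, day).timetuple().tm_yday


# Interval endpoints as (month, day) pairs, copied from the table.
intervals = {
    "kyoto": ((3, 21), (4, 10)),
    "liestal": ((3, 19), (4, 6)),
    "newyorkcity": ((3, 26), (4, 10)),
    "vancouver": ((3, 18), (4, 18)),
    "washingtondc": ((3, 17), (3, 31)),
}

widths = {site: doy(*hi) - doy(*lo) for site, (lo, hi) in intervals.items()}
total = sum(w * w for w in widths.values())

print(widths)  # kyoto 20, liestal 18, newyorkcity 15, vancouver 31, washingtondc 14
print(total)   # 2106
```

Vancouver's 31-day interval alone contributes 961 of the 2106, so the site with the fewest records dominates this metric.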
| Metric | Value |
|---|---|
| Ensemble backtest MAE | 6.1 days |
| R vs Python mean gap | 1.8 days |
| Local weight (\(w_A\)) | 50.7% |
| GAM weight (\(w_B\)) | 49.3% |
All code, data, and outputs are publicly available and fully reproducible.
Team 5103 · University of Maryland · Cherry Blossom Peak Bloom Prediction 2026